ABSTRACT  The problem of deciding whether the mean of an unknown distribution is in a set
A or in its complement, based on a sequence of independent random variables drawn according to
this distribution, is considered. Using large deviations techniques, an algorithm is proposed which
is shown to lead to an a.s. correct decision for a class of sets A which are not necessarily countable.
A refined decision procedure is also presented which, given a countable decomposition of A, can
determine a.s. to which set of the decomposition the mean belongs. This extends and simplifies a
construction by Cover.
1  Introduction


Consider the following hypothesis testing problem: Let $x_1, x_2, \ldots$ denote a sequence of i.i.d.
random variables with marginal law $P_T$, with support in $[0,1]$. The mean of $P_T$, denoted $\bar{\mu}_T$, is
known to belong either to a (known) set $A$ which has measure 0 or to its complement $B = A^c$. We
want to decide, based on the observation sequence $x_1, x_2, \ldots, x_n$, whether $\bar{\mu}_T \in A$ or not.

This problem was considered by Cover in [1], where he treated the case of $A = \mathbb{Q}_{[0,1]}$, the set of
rationals in $[0,1]$, and more generally the case of countable $A$. He proposed there a test which, for
any measure with $\bar{\mu}_T \in A$, will make (a.s.) only a finite number of mistakes and which, for measures
with $\bar{\mu}_T \in B \setminus N$, also makes (a.s.) only a finite number of mistakes, where $N$ is a set of Lebesgue
measure 0. Extensions of this result were considered by Koplowitz [3], who established
properties of sets $A$ which allow for such a decision and gave some characterizations of the set $N$.

In  this  note,  we  extend  the  result  of  [1]  by  allowing  the  set  A  to  be  uncountable,  not  necessarily 
of  measure  0,  such  that  it  satisfies  the  following  structural  assumption: 

Assumption  There exists a monotone sequence of sets $A_m$ increasing to $A$ and an appropriate
positive sequence $\epsilon(m) \to 0$ such that, for each $m$, the open blow up $B_m = \{x :
d(x, A_m) < \sqrt{2\epsilon(m)}\}$ is such that the Lebesgue measure of $B_m \setminus A_m$ is smaller than $1/m^2$. (We will
use the fact that the open blow ups $B_m$ satisfy $(d(A_m, B_m^c))^2 \ge 2\epsilon(m) > 0$.)

We  note  that  this  Assumption  implies  that  if  A  has  Lebesgue  measure  zero,  it  is  of  the  first 
category  (i.e.,  a  countable  union  of  nowhere  dense  sets).  The  Assumption  is  satisfied  by  a  class  of 
interesting  uncountable  sets  A ,  e.g.  the  Cantor  set.  Obviously,  for  countable  sets,  the  Assumption 
is satisfied. For more along these lines, cf. Lemma 2 and the remarks which follow Theorem 1.
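To make the countable case concrete, here is a minimal sketch of one possible choice of $A_m$ and $\epsilon(m)$ for the rationals in $[0,1]$. The enumeration order, the helper names, and the choice $\delta_m = 1/(4m^3)$ are illustrative assumptions, not taken from the text; any $\delta_m$ with $2m\delta_m < 1/m^2$ would serve.

```python
from fractions import Fraction
from math import sqrt

def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1] without repetition (hypothetical helper)."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            f = Fraction(p, q)
            if f.denominator == q:  # keep p/q only when it is in lowest terms
                yield f
        q += 1

def assumption_parameters(m):
    """Return (A_m, delta_m, eps_m) for the first m enumerated rationals.

    The blow up {x : d(x, A_m) < delta_m} is a union of at most m intervals
    of length 2*delta_m, so its Lebesgue measure is at most 2*m*delta_m.
    With delta_m = 1/(4 m^3) (an assumed choice) this is 1/(2 m^2) < 1/m^2.
    """
    gen = rationals_in_unit_interval()
    a_m = [next(gen) for _ in range(m)]
    delta_m = 1.0 / (4 * m**3)
    eps_m = delta_m**2 / 2  # chosen so that sqrt(2 * eps_m) equals delta_m
    return a_m, delta_m, eps_m
```

Since a finite set has measure zero, the measure of $B_m \setminus A_m$ equals that of $B_m$, which is what the bound in the docstring controls.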

In Section 2, we describe a decision algorithm which changes its decisions after increasingly
long intervals. Those intervals are chosen using entropy bounds. We prove that this
algorithm shares the properties of Cover’s decision rule, i.e. it makes a finite number of mistakes
a.s. on the set $A$ and on $A^c \setminus N$ for an appropriate set $N$ of Lebesgue measure 0. (A characterization
of $N$ follows from our proof and is related to the one given in [3].) In Section 3, the results are
extended to allow a (countable) sub-decision inside the set $A$.

2  The  decision  rule  and  proof  of  the  main  theorem 

We begin by describing the proposed decision rule. Let $\beta(m)$ be a given sequence, to be
defined below. For any input sequence $x_1, \ldots, x_n$, form the subsequences

$X^m = (x_{\beta(m-1)}, \ldots, x_{\beta(m)-1})$.

Let $\bar{\mu}_{\alpha(m)}$ denote the empirical mean of the sequence $X^m$, where $\alpha(m) = \beta(m) - \beta(m-1)$
is its length. At the end of each parsing, make a
decision whether $\bar{\mu}_T \in A$ according to whether $\bar{\mu}_{\alpha(m)} \in B_m$ or not. Between parsings, don’t change
the decision. For the sequence $\beta(m)$ defined below in equation (2.7), we claim:

Theorem  1 

a) For any measure $P_T$ with $\bar{\mu}_T \in A$, the decision rule will make (a.s.) only a finite number of
mistakes, i.e. for a.e. $\omega$ there exists an $n(\omega)$ such that the decision is $A$ for all $n > n(\omega)$.

b) For any measure $P_T$ with $\bar{\mu}_T \in A^c \setminus N$, where $N$ is a set of Lebesgue measure 0, the decision
rule will make (a.s.) only a finite number of mistakes, i.e. for a.e. $\omega$ there exists an $n(\omega)$ such
that the decision is $A^c$ for all $n > n(\omega)$.

Before proving the theorem, we introduce some notation and define the sequence $\beta(m)$. For
a set $E \subset [0,1]$, $E^c$ denotes the complement of $E$ and $\bar{E}$ denotes the closure of $E$, whereas $E^\circ$
denotes the interior of $E$. Let $\mu$ be a probability measure with support in $[0,1]$. The mean of
$\mu$ is denoted $\bar{\mu}$. Let $M_\mu(\lambda) = E_\mu(\exp(\lambda x))$ denote the moment generating function of $\mu$ and let
$\Lambda(\lambda) = \log(M_\mu(\lambda))$. Let $I_\mu(x) = \sup_\lambda (\lambda x - \Lambda(\lambda))$ be the Legendre transform of $\Lambda(\lambda)$, and let $H(\nu|\mu)$
denote the relative entropy of $\nu$ with respect to $\mu$, i.e. $H(\nu|\mu) = \int_0^1 d\nu(x) \log\left(\frac{d\nu}{d\mu}(x)\right)$ if $\frac{d\nu}{d\mu}$ exists and
$\infty$ otherwise. It is known that both $I_\mu(x)$ and $H(\nu|\mu)$ are convex, lower semicontinuous functions
(e.g., see [2]). Further, it is well known that for any open (closed) set $C$ in $[0,1]$,

$$\inf_{x \in C} I_\mu(x) = \inf_{\{\nu : \int_0^1 x \, d\nu(x) \in C\}} H(\nu|\mu). \qquad (2.1)$$
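As an illustration of (2.1) in the simplest setting, the sketch below takes $\mu$ to be a Bernoulli($p$) measure and $C = \{x\}$. In that case the only $\nu$ with mean $x$ and finite relative entropy is Bernoulli($x$), so the right-hand side has a closed form that can be compared with a numerical Legendre transform. The function names, the search interval, and the use of ternary search are assumptions made for this example only.

```python
import math

def log_mgf(lam, p):
    """Lambda(lambda) = log E[exp(lambda * x)] for a Bernoulli(p) measure."""
    return math.log(1 - p + p * math.exp(lam))

def rate_legendre(x, p, lo=-50.0, hi=50.0, iters=200):
    """Approximate sup over lambda of (lambda * x - Lambda(lambda)).

    The objective is concave in lambda, so ternary search on [lo, hi]
    converges to the maximiser when it lies inside the interval.
    """
    def objective(lam):
        return lam * x - log_mgf(lam, p)

    for _ in range(iters):
        a = lo + (hi - lo) / 3
        b = hi - (hi - lo) / 3
        if objective(a) < objective(b):
            lo = a
        else:
            hi = b
    return objective((lo + hi) / 2)

def relative_entropy_bernoulli(x, p):
    """H(nu|mu) for nu = Bernoulli(x), mu = Bernoulli(p), with 0 < x, p < 1."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))
```

For example, `rate_legendre(0.7, 0.5)` and `relative_entropy_bernoulli(0.7, 0.5)` should agree up to numerical error.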


Next, let $\bar{\mu}_n = \frac{1}{n} \sum_{i=1}^n x_i$ denote the empirical mean of the sequence $x_1, x_2, \ldots, x_n$, and let
$L_n = \frac{1}{n} \sum_{i=1}^n \delta_{x_i}$ denote the empirical measure of the sequence $x_1, x_2, \ldots, x_n$. By the classical Cramér
theorem, one has that, for any closed set $C$ and any probability measure $\mu$ with support in $[0,1]$
(cf. [2, proof of Lemma 1.2.5]),

$$P_\mu(\bar{\mu}_n \in C) \le 2 \exp\left(-n \inf_{x \in C} I_\mu(x)\right). \qquad (2.2)$$
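The bound (2.2) can be compared with an exact tail probability in a case where both are computable. The sketch below assumes $\mu$ is Bernoulli($p$) and $C = [x, 1]$ with $x \ge p$, so that the infimum of $I_\mu$ over $C$ is attained at $x$; the function names are hypothetical.

```python
import math

def rate(x, p):
    """I_mu(x) for mu = Bernoulli(p), with 0 < p < 1 and 0 <= x <= 1."""
    total = 0.0
    for a, b in ((x, p), (1 - x, 1 - p)):
        if a > 0:  # convention: 0 * log 0 = 0
            total += a * math.log(a / b)
    return total

def exact_tail(n, p, x):
    """P(empirical mean of n Bernoulli(p) samples >= x), from the binomial law."""
    k_min = math.ceil(n * x - 1e-12)  # guard against floating-point round-off
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

def cramer_bound(n, p, x):
    """Right-hand side of (2.2) for C = [x, 1], assuming x >= p."""
    return 2 * math.exp(-n * rate(x, p))
```

Calling `exact_tail(n, 0.5, 0.7)` and `cramer_bound(n, 0.5, 0.7)` for increasing `n` shows both quantities decaying, with the exact probability staying below the bound.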


We next define the sequence $\beta(m)$: for any $m$, let $B_m$ be the open cover of the set $A_m$ described
in the Assumption above. For any $m$, compute

$$I_m = \inf_{\{\mu : \bar{\mu} \in A_m\}} \; \inf_{x \in B_m^c} I_\mu(x). \qquad (2.3)$$


Note that by (2.1), one also has that

$$I_m = \inf_{\{\mu : \bar{\mu} \in A_m\}} \; \inf_{\{\nu : \bar{\nu} \in B_m^c\}} H(\nu|\mu). \qquad (2.4)$$


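Since $d(A_m, B_m^c)^2 \ge 2\epsilon(m)$, one has that $I_m \ge \epsilon(m)$. Indeed, by [2, Exercise 3.2.24], $2H(\nu|\mu) \ge
\|\nu - \mu\|_{var}^2 \ge (d(A_m, B_m^c))^2$, where the last inequality holds for $\{\nu : \bar{\nu} \in B_m^c\}$ and $\{\mu : \bar{\mu} \in A_m\}$
because, for measures supported on $[0,1]$, $|\bar{\nu} - \bar{\mu}| \le \|\nu - \mu\|_{var}$.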
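The chain of inequalities used here can be checked numerically in a simple case. The sketch below is restricted to Bernoulli measures and assumes that $\|\cdot\|_{var}$ is the total variation norm computed as the sum of absolute differences of the point masses; the function names are illustrative.

```python
import math

def relative_entropy(x, p):
    """H(nu|mu) for nu = Bernoulli(x), mu = Bernoulli(p), with 0 < p < 1."""
    total = 0.0
    for a, b in ((x, p), (1 - x, 1 - p)):
        if a > 0:  # convention: 0 * log 0 = 0
            total += a * math.log(a / b)
    return total

def variation_norm(x, p):
    """Sum of |nu_i - mu_i| over the two atoms {0, 1}."""
    return abs(x - p) + abs((1 - x) - (1 - p))

def chain_holds(x, p):
    """Check 2 H(nu|mu) >= ||nu - mu||^2 >= |mean(nu) - mean(mu)|^2."""
    mean_gap = abs(x - p)  # the mean of Bernoulli(x) is x
    return 2 * relative_entropy(x, p) >= variation_norm(x, p) ** 2 >= mean_gap ** 2
```

On a grid such as `x = i/20`, `p = j/20` with `0 < p < 1`, `chain_holds` returns `True` throughout, which is consistent with the bound $I_m \ge \epsilon(m)$ but is of course not a proof of it.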

Next, let

$$\alpha(m) = \frac{\log 2 + 2 \log m}{I_m}. \qquad (2.5)$$


Note that, by (2.2), for any $\mu$ such that $\bar{\mu} \in A_m$,

$$P_\mu(\bar{\mu}_{\alpha(m)} \in B_m^c) \le \frac{1}{m^2}. \qquad (2.6)$$
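For completeness, the step from (2.2) and (2.5) to (2.6) can be written out as follows. This is a reconstruction of the intermediate algebra under the definitions above, using that $B_m^c$ is closed because $B_m$ is open.

```latex
\begin{align*}
P_\mu\bigl(\bar{\mu}_{\alpha(m)} \in B_m^c\bigr)
  &\le 2 \exp\Bigl(-\alpha(m) \inf_{x \in B_m^c} I_\mu(x)\Bigr)
     && \text{by (2.2) with } n = \alpha(m),\ C = B_m^c \\
  &\le 2 \exp\bigl(-\alpha(m)\, I_m\bigr)
     && \text{by (2.3), since } \bar{\mu} \in A_m \\
  &= 2 \exp\bigl(-\log 2 - 2 \log m\bigr)
     && \text{by (2.5)} \\
  &= \frac{1}{m^2}.
\end{align*}
```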


Finally, let

$$\beta(m) = \sum_{i=1}^m \alpha(i), \qquad \beta(0) = 0. \qquad (2.7)$$
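With the block lengths in hand, the decision rule of this section can be sketched in code. The following is a minimal illustration and not the authors' implementation. It makes two assumptions beyond the text: block lengths are rounded up to integers, and the lower bound $\epsilon(m) \le I_m$ is used in place of $I_m$, which only lengthens the blocks. The names `alpha`, `block_decisions`, `eps` and `in_blow_up` are hypothetical.

```python
import math
from typing import Callable, Iterable, Iterator

def alpha(m: int, rate_lower_bound: float) -> int:
    """Block length from (2.5), rounded up to a whole number of samples.

    `rate_lower_bound` stands in for I_m; any positive value not exceeding
    I_m (for instance eps(m)) gives a block that is at least as long.
    """
    return max(1, math.ceil((math.log(2) + 2 * math.log(m)) / rate_lower_bound))

def block_decisions(samples: Iterable[float],
                    eps: Callable[[int], float],
                    in_blow_up: Callable[[int, float], bool]) -> Iterator[bool]:
    """Yield one decision per completed block; True means "the mean is in A".

    `in_blow_up(m, mean)` should report whether `mean` lies in B_m.
    """
    it = iter(samples)
    m = 1
    while True:
        n = alpha(m, eps(m))
        block = [x for _, x in zip(range(n), it)]
        if len(block) < n:  # not enough data to finish this block
            return
        yield in_blow_up(m, sum(block) / n)
        m += 1
```

As a usage example, for $A = \{1/2\}$ one may take `eps = lambda m: 1 / (32 * m**4)` and `in_blow_up = lambda m, mean: abs(mean - 0.5) < 1 / (4 * m**2)`, so that $\sqrt{2\epsilon(m)} = 1/(4m^2)$ and the blow up has measure below $1/m^2$.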

Proof  of  Theorem  1 
a) Assume $\bar{\mu}_T \in A$. Then, since the $A_m$ increase to $A$, there exists an $m_0$ such that
$\bar{\mu}_T \in A_m$ for all $m \ge m_0$. Note that the event of
making an error infinitely often is equivalent to the event of making an error at the parsing
intervals infinitely often. However,

$$\sum_{m=1}^\infty \mathrm{Prob}(\text{error in } m\text{-th parsing}) \le m_0 + \sum_{m=m_0}^\infty \frac{1}{m^2} < \infty,$$

where we have used (2.6) above for $m \ge m_0$. Therefore, part a) of the theorem follows by the
Borel-Cantelli lemma.


b) Let $C_m$ denote the $2\sqrt{2\epsilon(m)}$ blow up of $B_m$. Let

$$N = \bigcap_{n=1}^\infty \bigcup_{m=n}^\infty C_m \setminus A.$$

Clearly, the Lebesgue measure of $N$ is zero. Now we may repeat the arguments of part a)
in the following way: let $\bar{\mu}_T \in B \setminus N$. For an $m_0$ large enough, $\bar{\mu}_T \in C_m^c$ for all $m > m_0$.
On the other hand, $d(\bar{\mu}_T, B_m)^2 \ge 2\epsilon(m)$ by our construction. Noting that the rate function satisfies
$\inf_{x \in B_m} I_{P_T}(x) \ge \epsilon(m)$, the proof follows identically as in part a).


□ 


Remarks 

1) The theorem could have been proved by obtaining (2.6) using more traditional bounds but
with a slower decision procedure (i.e., larger $\alpha(m)$).

2)  It  is  interesting  to  note  that  the  Cantor  set  satisfies  the  Assumption.  Indeed,  the  covering 
sets  Bm  are  just  the  intervals  associated  with  the  Cantor  partition. 

3)  By  modifying  the  structure  of  the  decision  rule,  one  may  also  make  a  hypothesis  test  inside 
A.  This  is  pursued  in  Section  3. 
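The Cantor set remark above can be made concrete with a short computation. The sketch below assumes the constant sequence $A_m = A$ (the Cantor set) and bounds the measure of a $\delta$-neighbourhood of $A$ by that of the $\delta$-enlargement of the stage-$k$ intervals of the middle-thirds construction, which contains it. The choice $\delta = 3^{-(k+1)}$ and all function names are assumptions made for illustration.

```python
from fractions import Fraction

def cantor_intervals(k):
    """Closed intervals of stage k of the middle-thirds construction, left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))
            refined.append((b - third, b))
        intervals = refined
    return intervals

def covering_measure(k, delta):
    """Measure of the delta-enlargement of the stage-k intervals, clipped to [0, 1]."""
    total = Fraction(0)
    right_end = Fraction(0)  # right end of the part of the union measured so far
    for a, b in cantor_intervals(k):
        lo = max(a - delta, right_end, Fraction(0))
        hi = min(b + delta, Fraction(1))
        if hi > lo:
            total += hi - lo
        right_end = max(right_end, hi)
    return total

def stage_for(m):
    """Smallest stage k whose 3**-(k+1) enlargement has measure below 1/m**2."""
    k = 0
    while covering_measure(k, Fraction(1, 3 ** (k + 1))) >= Fraction(1, m * m):
        k += 1
    return k
```

Given `k = stage_for(m)`, one admissible choice is $\delta_m = 3^{-(k+1)}$ and $\epsilon(m) = \delta_m^2 / 2$, so that $\sqrt{2\epsilon(m)} = \delta_m$.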

We  conclude  this  section  by  a  (partial)  characterization  of  the  sets  A  of  measure  0  which  satisfy 
the  Assumption: 

Lemma 2

A  set  A  which  is  of  measure  0  and  which  satisfies  the  Assumption  is  of  the  first  category  (i.e., 
A  is  a  countable  union  of  nowhere  dense  sets).  Conversely,  a  closed  set  A  of  Lebesgue  measure 
zero  satisfies  the  Assumption  if  A  is  of  the  first  category. 

Proof 

($\Rightarrow$) From the Assumption, $A = \bigcup_m A_m$. We need only show that each $A_m$ is nowhere dense.
But this follows immediately from the existence of a sequence of open blow ups of $A_m$ with
arbitrarily small Lebesgue measure (namely, $B_k$ for $k \ge m$).

($\Leftarrow$) If $A$ is of the first category then $A = \bigcup_i S_i$ where each $S_i$ is nowhere dense. Let $A_m =
\bigcup_{i=1}^m S_i$. Clearly, the $A_m$ monotonically increase to $A$. Also, since $A_m$ is nowhere dense, and $A$ is
closed, $|A_m^\delta| \to 0$ as $\delta \to 0$ where $|\cdot|$ denotes Lebesgue measure and $A_m^\delta = \{x : d(x, A_m) < \delta\}$ is
the (open) $\delta$-neighborhood of $A_m$. For each $m$, choose any $\delta_m > 0$ such that $|A_m^{\delta_m}| < 1/m^2$. Then
the Assumption is satisfied with $B_m = A_m^{\delta_m}$ and $\epsilon(m) = \delta_m^2/2$.


□ 

We note that, by a counterexample based on [4, Exercise 4, pg. 66], one cannot in general
dispense with the requirement that $A$ be closed in the converse direction of Lemma 2. Indeed, in [4] a
set $F$ of nonzero Lebesgue measure is constructed which is nowhere dense. To get a contradiction,
it now suffices to take a countable dense subset of this set $F$ to be any of the sets $A_m$.


3  Countable  hypothesis  testing 

In this section, we refine the decision rule to allow for deciding among a countable set of
hypotheses. In addition to deciding whether or not $\bar{\mu}_T \in A$, we also make a hypothesis test inside
$A$. Suppose that $A$ is written as $A = \bigcup_{i=1}^\infty S_i$ where the $S_i$ are disjoint. We are interested not only
in whether $\bar{\mu}_T \in A$ but, if so, in which of the $S_i$ contains $\bar{\mu}_T$. Specifically, we wish to decide
among the following countable set of hypotheses:

$H_i : \bar{\mu}_T \in S_i, \quad i = 1, 2, \ldots$

$H_0 : \bar{\mu}_T \notin A$

For the theorem below, restrictions must be placed on the decomposition of $A$. Namely, we assume
that the $S_i$ are pairwise positively separated, meaning that $d(S_i, S_j) > 0$ for every $i \ne j$. (Note
that, as before, $A$ is required to satisfy the structural Assumption of the introduction.)

We modify our previous decision rule as follows. At the end of each parsing (defined by the
sequence $\beta(m)$), find the least index $k$ (if one exists) such that $\bar{\mu}_{\alpha(m)}$ is contained in the $\sqrt{2\epsilon(m)}$
open blow up of $S_k \cap A_m$. If such a $k$ exists, then decide that $\bar{\mu}_T \in S_k$. Otherwise (if $\bar{\mu}_{\alpha(m)} \notin
(S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}$ for all $i$) decide that $\bar{\mu}_T \notin A$. Alternatively, we can think of this decision procedure
as first deciding whether or not $\bar{\mu}_T \in A$ as before. Then, if the decision is that $\bar{\mu}_T \in A$, make a
refinement by deciding that $\bar{\mu}_T \in S_k$ where $k$ is the least index such that $\bar{\mu}_{\alpha(m)} \in (S_k \cap A_m)^{(\sqrt{2\epsilon(m)})}$.
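The least-index step of the refined rule can be sketched as follows. This is an illustration under simplifying assumptions: each set $S_k \cap A_m$ is represented by a finite list of points, and `radius` plays the role of $\sqrt{2\epsilon(m)}$. The function name and the data layout are hypothetical.

```python
from typing import Optional, Sequence

def refined_decision(block_mean: float,
                     parts: Sequence[Sequence[float]],
                     radius: float) -> Optional[int]:
    """Return the least index k (1-based) whose open blow up contains block_mean.

    `parts[k - 1]` lists the points standing in for S_k intersected with A_m.
    Returning None corresponds to deciding that the mean is not in A.
    """
    for k, points in enumerate(parts, start=1):
        if any(abs(block_mean - s) < radius for s in points):
            return k
    return None
```

For instance, with `parts = [[0.25], [0.5, 0.75]]`, a block mean of `0.501` and radius `0.01` selects index 2, while a block mean of `0.9` selects no index. When the blow ups of two sets overlap, the smaller index wins, which is why positive separation matters in the theorem below.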

Theorem 2  If $A = \bigcup_{i=1}^\infty S_i$ satisfies the Assumption and the $S_i$ are pairwise positively separated
then

a) For any measure $P_T$ with $\bar{\mu}_T \in S_i$ for some $i$, the decision rule will make (a.s.) only a finite
number of mistakes, i.e. for a.e. $\omega$ there exists an $n(\omega)$ such that the decision is $S_i$ for all
$n > n(\omega)$.

b) For any measure $P_T$ with $\bar{\mu}_T \in A^c \setminus N$, where $N$ is a set of Lebesgue measure 0, the decision
rule will make (a.s.) only a finite number of mistakes, i.e. for a.e. $\omega$ there exists an $n(\omega)$ such
that the decision is $A^c$ for all $n > n(\omega)$.

Proof 
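a) Suppose that $\bar{\mu}_T \in S_i$. By the same considerations that led to (2.6), for any $\mu$ such that
$\bar{\mu} \in S_i \cap A_m$ we have

$$P_\mu\left(\bar{\mu}_{\alpha(m)} \notin (S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}\right) \le \frac{1}{m^2}. \qquad (3.8)$$

Since $\bar{\mu}_T \in S_i \subset A$, for sufficiently large $m$, $\bar{\mu}_T \in A_m$. Also, since the $S_j$ are pairwise positively
separated and $i$ is finite, for large enough $m$ the sets $(S_j \cap A_m)^{(\sqrt{2\epsilon(m)})}$ and $(S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}$
are disjoint for all $j < i$. That is, for sufficiently large $m$, denoted $m_0(i)$, as long as $\bar{\mu}_{\alpha(m)} \in
(S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}$ we have $\bar{\mu}_{\alpha(m)} \notin (S_j \cap A_m)^{(\sqrt{2\epsilon(m)})}$ for all $j < i$. Hence, for all $m > m_0(i)$, $i$ is
the least index satisfying the requirements of the decision procedure (so that a correct decision is
made) iff $\bar{\mu}_{\alpha(m)} \in (S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}$. Therefore,

$$\sum_{m=1}^\infty \mathrm{Prob}(\text{error in } m\text{-th parsing}) \le m_0(i) + \sum_{m=m_0(i)+1}^\infty P_T\left(\bar{\mu}_{\alpha(m)} \notin (S_i \cap A_m)^{(\sqrt{2\epsilon(m)})}\right) \le m_0(i) + \sum_{m=1}^\infty \frac{1}{m^2} < \infty$$

so that part a) follows by the Borel-Cantelli Lemma.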

b)  This  part  is  identical  to  part  b)  of  Theorem  1. 


□ 


Remarks 

1)  Cover’s  result  on  countable  hypothesis  testing  is  a  special  case  of  this  result  since  every 
countable  set  A  clearly  satisfies  the  Assumption  and  can  be  written  as  the  union  of  pairwise 
positively  separated  sets. 

2) If one is willing to allow the test to fail for some points in $A$, then the requirement that the
$S_i$ be pairwise positively separated can be dropped. The set $N_2 \subset A$ on which the test fails
in the general case can be characterized, and presumably conditions on the $S_i$ for which $N_2$
is a null set could be obtained.